Speaker recognition by means of acoustic and phonetically informed GMMs

Authors

Sandro Cumani

Pietro Laface

Farzana Kulsoom

Abstract

In this work we assess the recently proposed hybrid Deep Neural Network/Gaussian Mixture Model (DNN/GMM) approach for speaker recognition, considering the effects of the granularity of the phonetic DNN model and of the precision of the corresponding GMM models, which will be referred to as the phonetic GMMs. The aim of this work is to better understand the contribution of the phonetic information provided by the DNN model relative to the accuracy of the acoustic GMMs in fitting the distribution of the features associated with a given context-dependent phone state. The testbed for this work was the text-independent speaker recognition task defined by NIST for the 2012 Speaker Recognition Evaluation. Our experiments confirm that the acoustic and the phonetic GMMs are complementary. Thus, their score combination yields very good results if the DNN is trained on data collected in an environment similar to the one used for testing. We show, however, that using a single Gaussian per DNN state is not the best choice: the best single system was obtained by balancing the phonetic and acoustic precision of a DNN/GMM system.
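The abstract does not spell out how the two kinds of model differ in practice. In the usual formulation of hybrid DNN/GMM systems, which is assumed here and not taken from this page, the difference lies in where the frame-level posteriors come from: an acoustic GMM computes them from its own Gaussians, while a phonetic model takes them from the DNN output states. The zero-order and first-order statistics are then accumulated in the same way. The following is a minimal sketch of that idea with one Gaussian per state. All names, dimensions, and the random stand-in for the DNN posteriors are hypothetical, and the multi-Gaussian-per-state variant discussed in the abstract is not covered.

```python
import numpy as np


def gmm_posteriors(X, weights, means, variances):
    """Frame-level component posteriors of a diagonal-covariance GMM.

    X: (T, D) features; weights: (C,); means, variances: (C, D).
    Returns a (T, C) matrix whose rows sum to 1.
    """
    diff = X[:, None, :] - means[None, :, :]              # (T, C, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_joint = np.log(weights)[None, :] + log_gauss      # (T, C)
    # Subtract the row maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def sufficient_stats(X, post):
    """Zero-order (N) and first-order (F) statistics from frame posteriors."""
    N = post.sum(axis=0)   # (C,)  soft frame count per component/state
    F = post.T @ X         # (C, D) posterior-weighted feature sums
    return N, F


rng = np.random.default_rng(0)
T, D, C = 200, 4, 3                       # toy sizes, chosen arbitrarily
X = rng.normal(size=(T, D))               # stand-in for acoustic features

# "Acoustic" alignment: posteriors come from the GMM itself.
weights = np.full(C, 1.0 / C)
means = rng.normal(size=(C, D))
variances = np.ones((C, D))
acoustic_post = gmm_posteriors(X, weights, means, variances)

# "Phonetic" alignment: posteriors would come from a trained DNN.
# Here a softmax over random logits is used purely as a placeholder.
logits = rng.normal(size=(T, C))
logits -= logits.max(axis=1, keepdims=True)
phonetic_post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

N_ac, F_ac = sufficient_stats(X, acoustic_post)
N_ph, F_ph = sufficient_stats(X, phonetic_post)
```

Because `sufficient_stats` only sees a posterior matrix, the rest of a speaker recognition pipeline can stay unchanged when the alignment source is swapped, which is what makes comparing or combining the two systems straightforward.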


Similar resources

Pitch-dependent GMMs for text-independent speaker recognition systems

Gaussian mixture models (GMMs) and ergodic hidden Markov models (HMMs) have been successfully applied to model short-term acoustic vectors for speaker recognition systems. Prosodic features are known to carry information concerning the speaker’s identity and they can be combined with the short-term acoustic vectors in order to increase the performance of the speaker recognition system. In this ...


Fuzzy Gaussian mixture models for speaker recognition

A fuzzy clustering based modification of Gaussian mixture models (GMMs) for speaker recognition is proposed. In this modification, fuzzy mixture weights are introduced by redefining the distances used in the fuzzy c-means (FCM) functionals. Their reestimation formulas are proved by minimising the FCM functionals. The experimental results show that the fuzzy GMMs can be used in speaker recogniti...


Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition

This paper investigates the effectiveness of the DAEM (Deterministic Annealing EM) algorithm in acoustic modeling for speaker and speech recognition. Although the EM algorithm has been widely used to approximate the ML estimates, it has the problem of initialization dependence. To relax this problem, the DAEM algorithm has been proposed and confirmed the effectiveness in artificial small tasks....


GMM based clustering and speaker separability in the Timit speech database

Speaker recognition on the 630 speaker Timit speech database, using maximum probability selection with a simple Gaussian Mixture Model (GMM) for the data distribution for each speaker, gives above 99% correct recognition. In contrast, a powerful classifier such as a Multi Layer Perceptron (MLP), trained to estimate speaker probabilities, even on a small subset of speakers often performs no bett...


Acoustic language identification using fast discriminative training

Gaussian Mixture Models (GMMs) in combination with Support Vector Machine (SVM) classifiers have been shown to give excellent classification accuracy in speaker recognition. In this work we use this approach for language identification, and we compare its performance with the standard approach based on GMMs. In the GMM-SVM framework, a GMM is trained for each training or test utterance. Since i...




Publication date: 2015
